Vision-centric Bird's-Eye-View (BEV) perception has shown promising potential and attracted increasing attention in autonomous driving. Recent works mainly focus on improving efficiency or accuracy but neglect the domain shift problem, resulting in severe degradation of transfer performance. Through extensive observations, we identify the significant domain gaps that arise under scene, weather, and day-night changes, and make the first attempt to solve the domain adaptation problem for multi-view 3D object detection. Since BEV perception approaches are usually complicated and contain several components, the accumulation of domain shift across multiple latent spaces makes BEV domain adaptation challenging. In this paper, we propose a novel Multi-level Multi-space Alignment Teacher-Student ($M^{2}ATS$) framework to ease this domain shift accumulation, which consists of a Depth-Aware Teacher (DAT) and a Multi-space Feature Aligned (MFA) student model. Specifically, the DAT model adopts uncertainty guidance to sample reliable depth information in the target domain. After constructing domain-invariant BEV perception, it transfers pixel- and instance-level knowledge to the student model. To further alleviate domain shift at the global level, the MFA student model is introduced to align task-relevant multi-space features of the two domains. To verify the effectiveness of $M^{2}ATS$, we conduct BEV 3D object detection experiments on four cross-domain scenarios and achieve state-of-the-art performance (e.g., +12.6% NDS and +9.1% mAP on Day-Night). Code and dataset will be released.
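As a hedged illustration of the uncertainty-guided depth sampling in DAT, the minimal PyTorch sketch below keeps only low-entropy depth predictions on the target domain; the discretized depth head, the entropy criterion, and the `keep_ratio` quantile threshold are assumptions for illustration, not the paper's exact design.

```python
import torch

def uncertainty_guided_depth_sampling(depth_logits, keep_ratio=0.5):
    """Keep only the most confident per-pixel depth estimates on the target domain.

    depth_logits: (B, D, H, W) logits over D discrete depth bins (a common
    depth-head design in camera-based BEV detectors, assumed here).
    Returns the soft depth distribution and a boolean mask marking pixels
    whose predictive entropy lies in the lowest `keep_ratio` fraction.
    """
    probs = depth_logits.softmax(dim=1)                       # (B, D, H, W)
    entropy = -(probs * probs.clamp_min(1e-8).log()).sum(1)   # (B, H, W)
    thresh = torch.quantile(entropy.flatten(1), keep_ratio, dim=1)
    reliable = entropy <= thresh.view(-1, 1, 1)               # (B, H, W)
    return probs, reliable
```

Only the pixels flagged as reliable would then contribute depth pseudo-supervision when the teacher builds its BEV features on the target domain.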
The Robot Operating System (ROS) brings great automation potential to various fields involving production tasks, improving productivity and simplifying human operations. However, ROS relies heavily on communication yet lacks a secure data-sharing mechanism, and ensuring the exchange of confidential data among multiple robots poses a significant challenge in multi-robot interaction. In this paper, we present AuthROS, a secure and convenient authorization framework for ROS nodes with absolute security and high availability, built on a private Ethereum network and the SM cryptographic algorithms. To the best of our knowledge, AuthROS is the first secure data-sharing framework for robots equipped with ROS. The framework satisfies the requirements of immutability and security for confidential data exchanged between ROS nodes. In addition, authorization and authentication mechanisms are proposed that execute atomically without any third party, ensuring trustworthy data exchange. Both an SM2 key-exchange mechanism and an SM4 encryption mechanism are proposed for data transmission security. A data-digest uploading scheme is also implemented to improve the efficiency of data querying and uploading on the Ethereum network. Experimental results show that AuthROS can generate a digest from 800KB of encrypted data in 6.34ms. Through security analysis, AuthROS achieves secure data exchange, data-manipulation detection, and protection against node-forgery attacks.
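The digest-uploading idea can be sketched as follows; this is an assumption-laden illustration rather than the paper's implementation: SHA-256 stands in for the SM3 hash of the SM suite, and the function names are hypothetical.

```python
import hashlib

def make_digest(encrypted_payload: bytes) -> str:
    """Digest of the SM4-encrypted payload to be recorded on-chain.

    Only this short digest, not the payload itself, is written to the
    private Ethereum network, keeping on-chain storage and query costs low.
    SHA-256 is used here as a stand-in for SM3.
    """
    return hashlib.sha256(encrypted_payload).hexdigest()

def verify(encrypted_payload: bytes, onchain_digest: str) -> bool:
    """A receiving node recomputes the digest over the data it obtained
    off-chain and compares it with the on-chain record to detect tampering."""
    return make_digest(encrypted_payload) == onchain_digest
```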
Recently, deep neural networks (DNNs) have been used to reduce bandwidth and improve the quality of Internet video delivery. Existing methods train a content-aware super-resolution (SR) model for each video chunk on the server and stream the low-resolution (LR) video chunks together with the SR models to the client. Although they achieve promising results, the huge computational cost of network training limits their practical application. In this paper, we propose a method named Efficient Meta-Tuning (EMT) to reduce the computational cost. Instead of training from scratch, EMT adapts a meta-learned model to the first chunk of the input video. For the following chunks, it selects part of the parameters via gradient masking of the previously adapted model. To achieve further speedup for EMT, we propose a novel sampling strategy to extract the most challenging patches from video frames. The proposed strategy is highly efficient and brings negligible additional cost. Our method significantly reduces the computational cost while achieving better performance, paving the way for applying neural video delivery techniques to practical applications. We conduct extensive experiments based on various efficient SR architectures, including ESPCN, SRCNN, FSRCNN, and EDSR-1, demonstrating the generalization ability of our work. The code is released at \url{https://github.com/neural-video-delivery/emt-pytorch-eccv2022}.
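A minimal PyTorch sketch of the gradient-masking step is given below; it is illustrative only, and the per-tensor top-k selection, the `ratio` value, and the function names are assumptions rather than EMT's exact procedure.

```python
import torch

def build_gradient_mask(model, ratio=0.1):
    """Select the fraction `ratio` of entries with the largest gradient
    magnitude (accumulated while adapting to the previous chunk); only
    these entries are fine-tuned on the following chunks."""
    masks = {}
    for name, p in model.named_parameters():
        if p.grad is None:
            masks[name] = torch.zeros_like(p, dtype=torch.bool)
            continue
        g = p.grad.abs().flatten()
        k = max(1, int(ratio * g.numel()))
        thresh = torch.topk(g, k).values.min()
        masks[name] = p.grad.abs() >= thresh
    return masks

def masked_step(model, optimizer, masks):
    """Zero the gradients of frozen entries before the optimizer step."""
    for name, p in model.named_parameters():
        if p.grad is not None:
            p.grad.mul_(masks[name].to(p.grad.dtype))
    optimizer.step()
```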
This report presents the algorithm used in our submission to the Generic Event Boundary Detection (GEBD) challenge at CVPR 2022. In this work, we improve the existing Structured Context Transformer (SC-Transformer) method for GEBD. Specifically, a transformer decoder module is added after the transformer encoder to extract high-quality frame features. The final classification is performed jointly by the original binary classifier and a newly introduced multi-class classifier branch. To enrich motion information, optical flow is introduced as a new modality. Finally, model ensembling is used to further improve performance. The proposed method achieves an F1 score of 86.49% on the Kinetics-GEBD test set, improving over the previous SOTA method by 2.86% in F1 score.
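One way the joint classification could look is sketched below, under our own assumptions about feature shapes and the fusion rule; this is not the submission's code, and the multi-class branch semantics are hypothetical.

```python
import torch
import torch.nn as nn

class JointBoundaryHead(nn.Module):
    """Frame features from the transformer encoder-decoder are scored by
    both the binary boundary classifier and the added multi-class branch;
    the two predictions are combined for the final boundary decision."""
    def __init__(self, dim, num_classes):
        super().__init__()
        self.binary_head = nn.Linear(dim, 1)            # boundary vs. non-boundary
        self.multi_head = nn.Linear(dim, num_classes)   # hypothetical boundary types

    def forward(self, frame_feats):                     # (T, dim)
        p_bin = torch.sigmoid(self.binary_head(frame_feats)).squeeze(-1)
        p_cls = self.multi_head(frame_feats).softmax(-1)
        # One simple fusion: average the binary score with the probability
        # of belonging to any boundary class (class 0 = non-boundary).
        return 0.5 * (p_bin + (1.0 - p_cls[..., 0]))
```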
Since the future of computing is heterogeneous, scalability is a crucial problem for single-image super-resolution. Recent works try to train one network that can be deployed on platforms with different capacities. However, they rely on pixel-wise sparse convolution, which is not hardware-friendly and achieves only limited practical speedup. Since an image can be divided into patches of varying restoration difficulty, we propose a scalable method based on Adaptive Patch Exiting (APE) to achieve more practical speedup. Specifically, we propose to train a regressor to predict the incremental capacity of each layer for a patch. Once the incremental capacity falls below a threshold, the patch can exit at that layer. Our method can easily adjust the trade-off between performance and efficiency by changing the threshold of incremental capacity. Furthermore, we propose a new strategy to enable the network training of our method. We conduct extensive experiments across various backbones, datasets, and scaling factors to demonstrate the advantages of our method. Code is available at https://github.com/littlepure2333/ape
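The per-patch exiting loop can be pictured with the simplified sketch below; the block granularity, the scalar regressor output, and the shared tail are assumptions, and per-exit reconstruction details are omitted.

```python
import torch.nn as nn

class PatchEarlyExitSR(nn.Module):
    """Apply SR body blocks to a patch until the predicted incremental
    capacity of the next block drops below `threshold`."""
    def __init__(self, head, blocks, regressor, tail, threshold=0.1):
        super().__init__()
        self.head = head                      # shallow feature extractor
        self.blocks = nn.ModuleList(blocks)   # backbone body blocks
        self.regressor = regressor            # predicts incremental capacity
        self.tail = tail                      # shared reconstruction / upsampling
        self.threshold = threshold            # tunes the speed/quality trade-off

    def forward(self, patch):
        feat = self.head(patch)
        for block in self.blocks:
            # Predicted benefit of running one more block on this patch.
            if self.regressor(feat).mean() < self.threshold:
                break                         # the patch exits early
            feat = block(feat)
        return self.tail(feat)
```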
Self-supervised representation learning (SSL) on biomedical networks provides new opportunities for drug discovery, which often lacks available biological or clinical phenotypes. However, how to effectively combine multiple SSL models is challenging and rarely explored. We therefore propose multi-task joint strategies of self-supervised representation learning on biomedical networks for drug discovery, named MSSL2drug. We design six basic SSL tasks inspired by various modality features, including structure, semantics, and attributes in biomedical heterogeneous networks. Furthermore, fifteen combinations of multiple tasks are evaluated by a graph-based adversarial multi-task learning framework in two drug discovery scenarios. The results reveal two important findings: (1) combinations of multimodal tasks achieve the best performance compared with other multi-task joint strategies; (2) joint training of local and global SSL tasks yields higher performance than random task combinations. We therefore conjecture that the multimodal and local-global combination strategies can serve as guidelines for multi-task SSL in drug discovery.
With the development of deep neural networks (DNNs), plenty of DNN-based methods have been proposed for single-image super-resolution (SISR). However, existing methods mostly train DNNs on uniformly sampled LR-HR patch pairs, which fails to fully exploit the informative patches within an image. In this paper, we present a simple yet effective data augmentation method. We first devise a heuristic metric to evaluate the informative importance of each patch pair. To reduce the computational cost of evaluating all patch pairs, we further propose to optimize the computation of our metric via integral images, achieving approximately two orders of magnitude speedup. In our method, training patch pairs are then sampled according to this metric. Extensive experiments show that our sampling augmentation can consistently improve convergence and boost the performance of various SISR architectures, including EDSR, RCAN, RDN, SRCNN, and ESPCN, across different scaling factors (x2, x3, x4). Code is available at https://github.com/littlepure2333/samplingaug
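The integral-image speedup can be illustrated with the sketch below, where local variance serves as a stand-in for the paper's heuristic informativeness metric; the patch size, stride, and the metric itself are assumptions.

```python
import numpy as np

def patch_scores(img, patch=48):
    """Per-patch informativeness scores computed via integral images.

    Integral images of x and x**2 let every patch sum be read off in O(1),
    so scoring all patches avoids re-summing pixels patch by patch."""
    x = img.astype(np.float64)
    ii  = np.pad(x, ((1, 0), (1, 0))).cumsum(0).cumsum(1)       # pixel sums
    ii2 = np.pad(x ** 2, ((1, 0), (1, 0))).cumsum(0).cumsum(1)  # squared sums

    def box(I, y, x0, h, w):
        return I[y + h, x0 + w] - I[y, x0 + w] - I[y + h, x0] + I[y, x0]

    h, w = x.shape
    n = patch * patch
    scores = np.array([[box(ii2, y, x0, patch, patch) / n
                        - (box(ii, y, x0, patch, patch) / n) ** 2
                        for x0 in range(0, w - patch + 1, patch)]
                       for y in range(0, h - patch + 1, patch)])
    return scores  # sample training patches with probability proportional to scores
```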
Language model pre-training, such as BERT, has significantly improved the performance of many natural language processing tasks. However, pre-trained language models are usually computationally expensive, so it is difficult to execute them efficiently on resource-restricted devices. To accelerate inference and reduce model size while maintaining accuracy, we first propose a novel Transformer distillation method that is specially designed for knowledge distillation (KD) of Transformer-based models. By leveraging this new KD method, the plentiful knowledge encoded in a large "teacher" BERT can be effectively transferred to a small "student" TinyBERT. Then, we introduce a new two-stage learning framework for TinyBERT, which performs Transformer distillation at both the pre-training and task-specific learning stages. This framework ensures that TinyBERT can capture the general-domain as well as the task-specific knowledge in BERT. TinyBERT$_4$ with 4 layers is empirically effective and achieves more than 96.8% of the performance of its teacher BERT$_{\mathrm{BASE}}$ on the GLUE benchmark, while being 7.5x smaller and 9.4x faster on inference. TinyBERT$_4$ is also significantly better than 4-layer state-of-the-art baselines on BERT distillation, with only ∼28% of their parameters and ∼31% of their inference time. Moreover, TinyBERT$_6$ with 6 layers performs on par with its teacher BERT$_{\mathrm{BASE}}$.
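A generic layer-wise Transformer distillation loss can be sketched as below; this is a hedged illustration in which the layer mapping, the KL-based soft-label term, and the single projection matrix are simplifications, not TinyBERT's exact objective.

```python
import torch.nn.functional as F

def transformer_distill_loss(s_hidden, t_hidden, s_attn, t_attn,
                             s_logits, t_logits, proj, temperature=1.0):
    """s_hidden/t_hidden: lists of student/teacher hidden states for the
    matched layers; s_attn/t_attn: matching attention score tensors;
    proj: an nn.Linear mapping the student hidden size to the teacher's."""
    loss = 0.0
    for hs, ht in zip(s_hidden, t_hidden):
        loss = loss + F.mse_loss(proj(hs), ht)     # hidden-state matching
    for sa, ta in zip(s_attn, t_attn):
        loss = loss + F.mse_loss(sa, ta)           # attention matching
    # Soft-label distillation on the prediction layer.
    loss = loss + F.kl_div(F.log_softmax(s_logits / temperature, dim=-1),
                           F.softmax(t_logits / temperature, dim=-1),
                           reduction="batchmean") * temperature ** 2
    return loss
```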
Most semantic communication systems leverage deep learning models to provide end-to-end transmission performance surpassing the established source and channel coding approaches. While research has so far mainly focused on architecture and model improvements, a model trained over a full dataset and ergodic channel responses is unlikely to be optimal for every test instance. Due to limitations on model capacity and imperfect optimization and generalization, such learned models will be suboptimal, especially when the testing data distribution or channel response differs from that of the training phase, as is likely to be the case in practice. To tackle this, in this paper we propose a novel semantic communication paradigm that leverages the deep learning model's overfitting property. Our model can, for instance, be updated after deployment, which can lead to further substantial gains in transmission rate-distortion (RD) performance. This new system is named adaptive semantic communication (ASC). In our ASC system, the wireless transmitted stream includes both the semantic representations of the source data and the adapted decoder model parameters. Specifically, we take the overfitting concept to the extreme, proposing a series of ingenious methods to adapt the semantic codec or representations to an individual data or channel state instance. The whole ASC system design is formulated as an optimization problem whose goal is to minimize a loss function that is a tripartite tradeoff among the data rate, model rate, and distortion terms. Experiments (including a user study) verify the effectiveness and efficiency of our ASC system. Notably, the substantial gain of our overfitted coding paradigm can catalyze the upgrade of semantic communication to a new era.
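Only as a hedged formalization (the symbols and weights below are ours, not the paper's), the stated tripartite tradeoff can be written as a loss of the form
$$\mathcal{L} \;=\; R(\mathbf{z}) \;+\; \lambda_m\, R(\Delta\theta_d) \;+\; \lambda_d\, D(\mathbf{x}, \hat{\mathbf{x}}),$$
where $\mathbf{z}$ is the transmitted semantic representation (data rate), $\Delta\theta_d$ is the adapted decoder parameter update streamed alongside it (model rate), $D(\mathbf{x}, \hat{\mathbf{x}})$ is the reconstruction distortion, and $\lambda_m, \lambda_d$ are weights balancing the three terms.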
Recently, graph neural networks (GNNs) have significantly improved the performance of machine learning tasks on graphs. However, this technological breakthrough makes people wonder: how does a GNN make its decisions, and can we trust its predictions with high confidence? In critical domains such as biomedicine, where making wrong decisions can have severe consequences, it is crucial to interpret the inner working mechanism of GNNs before applying them. In this paper, we propose GNNInterpreter, a novel model-level explanation method for different GNNs that follow the message-passing scheme, to explain the high-level decision-making process of GNN models. More specifically, through continuous relaxation of graphs and the reparameterization trick, GNNInterpreter learns a probabilistic generative graph distribution that produces the most representative graph for the target prediction in the eyes of the GNN model. Compared with the only existing work, GNNInterpreter is more efficient and more flexible in generating explanation graphs with different types of node and edge features, without introducing another black box to explain the GNN and without requiring domain-specific knowledge. Furthermore, experimental studies on four different datasets demonstrate that the explanation graphs generated by GNNInterpreter match the desired graph pattern when the model is ideal and reveal potential model pitfalls if any exist.
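For the edge structure, the continuous relaxation with the reparameterization trick can be sketched as follows; this binary-concrete illustration rests on our own assumptions, and the node features and exact parameterization in GNNInterpreter may differ.

```python
import torch

def sample_soft_adjacency(edge_logits, tau=1.0):
    """Differentiably sample a soft adjacency matrix from per-edge
    Bernoulli logits via the binary concrete (Gumbel-sigmoid) relaxation,
    so the generative graph distribution can be trained with gradients."""
    u = torch.rand_like(edge_logits).clamp(1e-6, 1 - 1e-6)
    noise = torch.log(u) - torch.log1p(-u)              # logistic noise
    return torch.sigmoid((edge_logits + noise) / tau)   # entries in (0, 1)

# Training idea: feed the relaxed graph into the frozen GNN and maximize
# the target-class score with respect to edge_logits.
```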